Practical and flexible pattern matching over Ziv–Lempel compressed text
نویسندگان
چکیده
منابع مشابه
Compressed Pattern Matching for Text
The amount of information that we are dealing with today is being generated at an everincreasing rate. On one hand, data compression is needed to efficiently store, organize the data and transport the data over the limited-bandwidth network. On the other hand, efficient information retrieval is needed to speedily find the relevant information from this huge mass of data using available resource...
متن کاملA General Practical Approach to Pattern Matching over Ziv-Lempel Compressed Text
We address in this paper the problem of string matching on Lempel-Ziv compressed text. The goal is to search a pattern in a text without uncompressing. This is a highly relevant issue, since it is essential to have compressed text databases where eecient searching is still possible. We develop a general technique for string matching when the text comes as a sequence of blocks. This abstracts th...
متن کاملDirect Pattern Matching on Compressed Text
We present a fast compression and decompression technique for natural language texts. The novelty is that the exact search can be done on the compressed text directly, using any known sequential pattern matching algorithm. Approximate search can also be done ee-ciently without any decoding. The compression scheme uses a semi-static word-based modeling and a Huu-man coding where the coding alpha...
متن کاملPhrase-Based Pattern Matching in Compressed Text
Byte codes are a practical alternative to the traditional bit-oriented compression approaches when large alphabets are being used, and trade away a small amount of compression effectiveness for a relatively large gain in decoding efficiency. Byte codes also have the advantage of being searchable using standard string matching techniques. Here we describe methods for searching in byte-coded comp...
متن کاملMultiple Pattern Matching in LZW Compressed Text
In this paper we address the problem of searching in LZW compressed text directly, and present a new algorithm for finding multiple patterns by simulating the move of the Aho-Corasick pattern matching machine. The new algorithm finds all occurrences of multiple patterns whereas the algorithm proposed by Amir, Benson, and Farach finds only the first occurrence of a single pattern. The new algori...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of Discrete Algorithms
سال: 2004
ISSN: 1570-8667
DOI: 10.1016/j.jda.2003.12.002